
Linkage disequilibrium mapping via cladistic analysis of phase-unknown genotypes and inferred haplotypes in the Genetic Analysis Workshop 14 simulated data.


Abstract

We recently described a method for linkage disequilibrium (LD) mapping, using cladistic analysis of phased single-nucleotide polymorphism (SNP) haplotypes in a logistic regression framework. However, haplotypes are often not available and cannot be deduced with certainty from the unphased genotypes. One possible two-stage approach is to infer the phase of multilocus genotype data and analyze the resulting haplotypes as if known. Here, haplotypes are inferred using the expectation-maximization (EM) algorithm, and the best-guess phase assignment for each individual is analyzed. However, inferring haplotypes from phase-unknown data is prone to error, and this should be taken into account in the subsequent analysis. An alternative approach is to analyze the phase-unknown multilocus genotypes themselves. Here we present a generalization of the method for phase-known haplotype data to the case of unphased SNP genotypes. Our approach is designed for high-density SNP data, so we opted to analyze the simulated dataset. The marker spacing in the initial screen was too large for our method to be effective, so we used the answers provided to request further data in regions around the disease loci and in null regions. Power to detect the disease loci, accuracy in localizing the true site of the locus, and false-positive error rates are reported for the inferred-haplotype and unphased-genotype methods. For these data, analyzing inferred haplotypes outperforms analysis of genotypes. As expected, our results suggest that when there is little or no LD between a disease locus and the flanking region, there will be no chance of detecting it unless the disease variant itself is genotyped.
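The first stage of the two-stage approach described above (EM estimation of haplotype frequencies, then a best-guess phase assignment per individual) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes genotypes coded as 0/1/2 minor-allele counts per SNP, Hardy-Weinberg equilibrium, and brute-force enumeration of phases, which is only feasible for a handful of SNPs. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of 0/1/2 minor-allele counts, one per SNP (assumed coding).
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = tuple(1 if g == 2 else 0 for g in genotype)
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def pair_prob(pair, freq):
    """Probability of a haplotype pair under Hardy-Weinberg equilibrium."""
    h1, h2 = pair
    return freq[h1] * freq[h2] * (1 if h1 == h2 else 2)


def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior weight of each possible phase for this individual
            weights = [pair_prob(p, freq) for p in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by number of chromosomes
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
    return freq


def best_guess_phase(genotype, freq):
    """Return the most probable haplotype pair for one individual."""
    return max(compatible_pairs(genotype), key=lambda p: pair_prob(p, freq))


# Toy data for two SNPs: only the double heterozygotes (1, 1) are ambiguous.
genotypes = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
freq = em_haplotype_freqs(genotypes)
phase = best_guess_phase((1, 1), freq)
```

Keeping only the best-guess pair discards the posterior uncertainty computed in the E-step, which is the source of error the abstract says should be accounted for in the subsequent analysis.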

Bibliographic record

  • Authors

    Durrant, C; Morris, AP

  • Year: 2005
  • Original format: PDF
  • Language: English (eng)
